This page last changed on May 31, 2012 by prodrigu.

Overview

This page documents the policy for requesting re-computation of SAM availability results for WLCG and EGI. The current policy was approved on 14/02/2012.

Problems in the monitoring infrastructure may incorrectly lower the availability results of a site for a period of time. Once such a problem is confirmed, the affected test results must be verified and, where necessary, updated. The availability results are then re-computed so that the availability numbers are not affected by the monitoring infrastructure fault.

Conditions to request a re-computation

Requests for re-computation are only accepted:

  • When failures are due to problems in the monitoring infrastructure (invalid proxy certificate, problems in the SE used for replica tests, etc.);
  • When reported up to 10 calendar days after the announcement of the reports of a given month, which normally occurs on the 1st of the following month.

Below are some examples to better explain the conditions above:

  • Example 1
    • On 25-Jan-2012 region A requests the re-computation of the region availability due to a hardware problem on SAM-Nagios which happened on 15-Jan-2012.
    • The request is approved. The justification of the problem is valid and the request was reported on time (before the announcement of the report).
  • Example 2
    • On 05-Feb-2012 site B requests the re-computation of the site availability due to a problem with the host certificate of SAM-Nagios which happened on 15-Jan-2012.
    • The request is approved. The justification of the problem is valid and the problem was reported on time (5 days after the announcement of the first report).
  • Example 3
    • On 20-Feb-2012 region C requests the re-computation of the region availability due to a network problem on SAM-Nagios which happened on 15-Jan-2012.
    • The request is rejected. The problem was reported too late (more than 10 days after the announcement of the first report).
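
The timing condition in the three examples can be sketched in a few lines of Python. This is only an illustration, not part of the policy or of any SAM tool: the function names are hypothetical, and the announcement date of 1 February 2012 is an assumption based on the statement that reports are normally announced on the 1st of the following month.

```python
from datetime import date, timedelta

# Window taken from the policy text: 10 calendar days after the announcement.
REQUEST_WINDOW_DAYS = 10


def request_deadline(announcement: date) -> date:
    """Return the last day on which a re-computation request is accepted."""
    return announcement + timedelta(days=REQUEST_WINDOW_DAYS)


def is_on_time(request_day: date, announcement: date) -> bool:
    """Return True if the request arrives no later than the deadline."""
    return request_day <= request_deadline(announcement)


# Assumption: the January 2012 reports were announced on 1 February 2012.
announcement = date(2012, 2, 1)

examples = [
    ("Example 1", date(2012, 1, 25)),
    ("Example 2", date(2012, 2, 5)),
    ("Example 3", date(2012, 2, 20)),
]
for label, request_day in examples:
    verdict = "on time" if is_on_time(request_day, announcement) else "too late"
    print(f"{label}: requested {request_day.isoformat()}, {verdict}")
```

The check covers only the deadline. Whether the failure was really caused by the monitoring infrastructure still has to be confirmed by the support unit handling the ticket.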

How to request a re-computation

Confirm that the availability of your site/region is affected by browsing the MyWLCG/MyEGI service availability interface, then open a new GGUS ticket:

  • OPS VO: For EGI sites and regions, please assign the ticket as described in EGI PROC10;
  • OPS VO: For OSG sites, please assign the ticket to the SAM/Nagios Support Unit (3rd level experts);
  • HEP VOs: For WLCG sites, please assign the ticket to the SAM/Nagios Support Unit (3rd level experts).

The GGUS ticket must include the following information:

  • A description of the problem;
  • The site or ROC/NGI affected by the problem;
  • The start and end time of the problem (yyyy-mm-ddThh:mm in UTC);
  • The VO affected by the problem;
  • A link to MyWLCG/MyEGI pointing to the problem.
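
As an aid for preparing the ticket text, the sketch below assembles the required fields and renders the start and end times in the yyyy-mm-ddThh:mm UTC format. It is a hypothetical helper, not a GGUS or SAM API: the function names, the site name SITE-B, the times of day, and the placeholder link are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def format_utc(moment: datetime) -> str:
    """Render a timezone-aware datetime as yyyy-mm-ddThh:mm in UTC."""
    if moment.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M")


def build_ticket_body(description, affected, vo, link, start, end):
    """Assemble the ticket text, refusing empty fields or a reversed interval."""
    start_text = format_utc(start)
    end_text = format_utc(end)
    if end < start:
        raise ValueError("end time precedes start time")
    text_fields = {
        "Description": description,
        "Site or ROC/NGI affected": affected,
        "VO affected": vo,
        "MyWLCG/MyEGI link": link,
    }
    missing = [name for name, value in text_fields.items() if not value.strip()]
    if missing:
        raise ValueError("missing ticket fields: " + ", ".join(missing))
    lines = [f"{name}: {value}" for name, value in text_fields.items()]
    lines.append(f"Start (UTC): {start_text}")
    lines.append(f"End (UTC): {end_text}")
    return "\n".join(lines)


# Hypothetical values; local times are given in UTC+1 and converted to UTC.
local_zone = timezone(timedelta(hours=1))
body = build_ticket_body(
    description="Expired host certificate on SAM-Nagios",
    affected="SITE-B",
    vo="ops",
    link="https://example.org/placeholder-link",
    start=datetime(2012, 1, 15, 9, 30, tzinfo=local_zone),
    end=datetime(2012, 1, 15, 14, 0, tzinfo=local_zone),
)
print(body)
```

Requiring timezone-aware timestamps avoids silently reporting local time as UTC, which would shift the interval that gets re-computed.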

How to follow a re-computation

The deadline for requesting re-computations is 10 calendar days after the announcement of the reports for a given month. If the reports are announced on the 1st of the following month, the deadline is the 11th of that month. As soon as the re-computation is complete, the GGUS ticket is closed and the submitter is notified. The MyWLCG/MyEGI service availability interface can then be used to confirm the new availability numbers. The final report is published shortly after the deadline.

If there are no requests for re-computation, the first reports published at the beginning of the month are considered final. In any case, no further requests are considered after the deadline.




Document generated by Confluence on Feb 27, 2014 10:19